Background Genome-wide association studies have identified hundreds of genetic variants associated with specific
cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic
evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be
performed.

Methods We included 18023 patients with lung cancer and 60543 control subjects from two consortia, Population
Architecture using Genomics and Epidemiology (PAGE) and Transdisciplinary Research in Cancer of the Lung
(TRICL). We examined 165 single-nucleotide polymorphisms (SNPs) that were previously associated with at least
one of 16 non-lung cancer sites. Study-specific logistic regression results underwent meta-analysis, and
associations were also examined by race/ethnicity, histological cell type, sex, and smoking status. A Bonferroni-corrected
P value of 2.5 x 10^-5 was used to assign statistical significance.

Results The breast cancer SNP LSP1 rs3817198 was associated with an increased risk of lung cancer (odds ratio [OR] = 1.10;
95% confidence interval [CI] = 1.05 to 1.14; P = 2.8 x 10^-6). This association was strongest for women with
adenocarcinoma (P = 1.2 x 10^-4) and not statistically significant in men (P = .14) with this cell type (Phet by sex = .10). Two
glioma risk variants, TERT rs2853676 and CDKN2BAS1 rs4977756, which are located in regions previously
associated with lung cancer, were associated with increased risk of adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22;
P = 1.1 x 10^-8) and squamous cell carcinoma (OR = 1.13; CI = 1.07 to 1.19; P = 2.5 x 10^-5), respectively.

Conclusions Our findings demonstrate a novel pleiotropic association between the breast cancer LSP1 risk region marked by 
variant rs3817198 and lung cancer risk. 
Globally, lung cancer is the most common malignancy and cause
of cancer-related deaths (1,2). Tobacco smoking is the primary
risk factor for lung cancer, but there is evidence that genetic
susceptibility plays a role. Notably, recent genome-wide association
studies (GWASs) of lung cancer have identified single-nucleotide
polymorphisms (SNPs) in at least 10 independent loci (P < 5 x 10^-8)
influencing risk in different populations (3). However, these
identified loci explain only a small fraction of lung cancer susceptibility,
and the challenge remains to identify the many additional common
risk loci that are expected to have small genetic effects (3).

To date, more than 400 SNPs have been associated with cancer
in GWASs (3). The discovery of pleiotropic effects, where a single
gene variant is associated with more than one phenotype, may
allow for the identification of shared disease pathways. For cancer,
this may ultimately lead to the detection of susceptible individuals
as well as to the development of regimens for the prevention of
multiple cancers and pathway-based treatment. Genetic variants at
chromosome 8q24, in TP53, and in TERT, the telomerase reverse
transcriptase gene, are examples of loci with pleiotropic effects for
multiple cancer sites and other chronic diseases (4-6). For lung
cancer, a systematic evaluation of possible pleiotropic associations
for the many risk variants identified with other cancer sites has yet
to be conducted.

These genetic associations may have been missed in prior
GWASs of lung cancer due to the heavy multiple comparison
penalty in surveying the entire genome or due to disease
heterogeneity in factors such as histological cell types or smoking
status. For example, TERT rs2736100 (7-9) has been primarily
associated with risk of adenocarcinoma of the lung, often diagnosed
among nonsmokers, whereas SNPs in the 15q25 region,
which include nicotinic acetylcholine receptor genes involved in
nicotine dependence, have been primarily associated with lung
cancer among smokers (10).

Here, we examined the pleiotropic effects of 165 risk variants
initially identified for other cancers on lung cancer risk. Our study
was a collaboration between two large consortia (11,12), in
which we also examined the consistency of associations by
race/ethnicity, tumor histology, sex, and smoking status.

Methods 

Study Participants 

Two consortia contributed data to this study: the Population
Architecture using Genomics and Epidemiology (PAGE) (12) and
the Transdisciplinary Research in Cancer of the Lung (TRICL)
(11), which is part of the Genetic Associations and Mechanisms in
Oncology (GAME-ON) consortium and is associated with the
International Lung Cancer Consortium (ILCCO). This collaboration 
provided information on 18023 patients with lung cancer and 60543 
control subjects from 13 studies (Supplementary Table 1, available 
online). Details regarding these participating studies are described in 
the Supplementary Data (available online). All studies were based on 
primary incident nonsarcoma and nonlymphoma lung cancer cases, 
and more than 95% of the cases were pathologically confirmed. The 
majority of these studies utilized patients and control subjects who had 
no history of another cancer. Among the few studies in which a small 
proportion of patients and control subjects had a history of another 
cancer, our findings were similar when excluding these participants. 
Participants' informed consent and institutional review board approval
were obtained for all studies except Epidemiologic Architecture
for Genes Linked to Environment, which accesses the Vanderbilt
University biorepository (EAGLE-BioVU) and is considered
nonhuman subjects research due to sample de-identification (13).

SNP Selection and Genotyping 

A total of 165 SNPs associated with 16 malignancies (excluding lung
cancer and smoking-related SNPs) were selected as of January 2010
from the National Human Genome Research Institute GWAS catalog (3)
and a review of the cancer GWAS and fine-mapping literature
(Supplementary Table 3, available online). Additionally, we
studied 18 lung cancer risk variants to replicate their associations with
lung cancer risk in populations of European ancestry (Supplementary
Table 2, available online) (11). The risk allele for each SNP was defined
as the allele associated with an increased risk of cancer in the initial
report. For PAGE, candidate SNP genotyping was performed using
Illumina BeadXpress (Women's Health Initiative [WHI]), Sequenom
(EAGLE-BioVU), and the TaqMan OpenArray platform (Multiethnic
Cohort study [MEC]). The Atherosclerosis Risk in Communities Study
(ARIC; in PAGE) and TRICL extracted genotypes from GWAS
data and comprised only European-ancestry populations.
The ARIC samples were genotyped using the Affymetrix 6.0 platform.
Genotypes were called with Birdseed, and only SNPs with call
rate equal to or greater than 90%, minor allele frequency (MAF) equal to or greater than 1%,
and Hardy-Weinberg equilibrium P > 1x10^ were considered for
imputation. Untyped and missing SNPs were imputed using Mach1
v1.0.16 based on HapMap release 2 (build 36) and a European ancestry
(CEU) reference panel (14). Imputed SNPs with a quality threshold
of greater than or equal to 0.3 were included in this analysis.
MEC, EAGLE-BioVU, and WHI could not impute missing SNPs
due to the reduced number of variants genotyped. For TRICL, genotyping
was performed using the Illumina HumanHap300 BeadChips
or HumanHap550 or 610 Quad arrays. At the time of this analysis,
imputed SNPs were not available for TRICL.

All PAGE studies, with the exception of ARIC, genotyped
a panel of 128 ancestry informative markers (15) and used principal
components analysis to estimate principal components of
genetic ancestry (16). ARIC (17) and TRICL (11) estimated principal
components of genetic ancestry based on GWAS data using
EIGENSTRAT (16). These principal components of genetic
ancestry were included in regression models to adjust for population
substructure.

Standard quality-assurance and quality-control measures were
utilized to ensure genotyping quality. In PAGE (12), samples and
SNPs were included based on call rates (>90%), concordance of
blinded replicates (>98%), and Hardy-Weinberg equilibrium
(departures defined as P < .001). More than 97.9% of samples and more than
99% of SNPs had a call rate equal to or greater than 95% in all four
PAGE studies. In TRICL (11), samples were excluded if the average
call rate was less than 90%; if there was sex discrepancy (threshold
of heterozygosity >10% for men and <20% for women), unexpected
duplicates, evidence of first-degree relatedness, or heterozygosity
rates for autosomal chromosomes exceeding six standard deviations
of the mean; samples with less than 80% European ancestry based on
STRUCTURE (18) analysis, and outliers based on principal component
analysis using EIGENSTRAT (16), were also excluded.
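
As a rough illustration of the SNP-level filters described above, the sketch below computes a call rate and a one-degree-of-freedom chi-square test for departure from Hardy-Weinberg equilibrium from genotype counts. The function names, the example counts in the usage note, and the use of the PAGE thresholds (call rate >90%, departure at P < .001) as defaults are illustrative assumptions; the consortia's actual pipelines are not described at this level of detail.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) P value for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing, min_call_rate=0.90, hwe_alpha=0.001):
    """Keep a SNP if its call rate exceeds the minimum and HWE is not rejected."""
    called = n_aa + n_ab + n_bb
    call_rate = called / (called + n_missing)
    return call_rate > min_call_rate and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha
```

For example, passes_qc(90, 420, 490, 10) keeps a hypothetical SNP whose genotype counts match Hardy-Weinberg proportions, whereas a marked heterozygote deficit such as passes_qc(300, 100, 600, 10) is rejected.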

Statistical Analyses 

For each study, we estimated the association between each SNP and
risk of lung cancer using unconditional logistic regression and an
additive genetic model of the risk allele. Models were adjusted for
age, sex, country/study center (as appropriate), principal components
of genetic ancestry, and smoking status (never, former, current).
The Liverpool and Institute of Cancer Research (ICR) studies,
which used generic control subjects, were not adjusted for age, sex,
or smoking status. Studies with more than 85 lung cancer cases per
racial/ethnic group were retained for race/ethnicity-stratified analysis.
Associations by tumor histology were estimated based on logistic
models of World Health Organization-defined histological cell type
(adenocarcinoma, squamous cell carcinoma [SCC], and small cell
lung cancer) compared with all control subjects. Large cell lung cancers
were not included in the histology-specific analysis due to their
limited sample size and heterogeneous nature. Stratified analyses by
sex and smoking status (never and ever) were also performed.
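
The additive genetic model can be sketched as follows. This is a minimal, unadjusted illustration that fits logit(P[case]) = b0 + b1 x g by Newton-Raphson on genotype counts, where g is the number of risk alleles (0, 1, or 2). The genotype counts are hypothetical, and the covariate adjustment described above (age, sex, study center, principal components, smoking status) is deliberately omitted, so this is not the model that was actually fitted.

```python
import math


def fit_additive_logistic(cases_by_genotype, controls_by_genotype, n_iter=25):
    """Fit logit(P(case)) = b0 + b1 * g for g = 0, 1, 2 risk alleles.

    Returns (per-allele odds ratio, standard error of b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = 0.0          # score vector
        i00 = i01 = i11 = 0.0  # Fisher information matrix (symmetric 2 x 2)
        for g, (y, c) in enumerate(zip(cases_by_genotype, controls_by_genotype)):
            n = y + c
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            w = n * p * (1.0 - p)
            s0 += y - n * p
            s1 += (y - n * p) * g
            i00 += w
            i01 += w * g
            i11 += w * g * g
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + information^-1 * score
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    se_b1 = math.sqrt(i00 / det)  # from the inverse information matrix
    return math.exp(b1), se_b1


# Hypothetical counts for genotypes carrying 0, 1, and 2 risk alleles
cases = (400, 450, 150)
controls = (500, 400, 100)
odds_ratio, se = fit_additive_logistic(cases, controls)
```

The exponentiated slope is the per-allele odds ratio, which is the quantity carried forward from each study into the meta-analysis.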

To examine whether the associations with SNPs in TERT 
were independent of the known lung cancer risk variant in TERT 
(rs2736100) (7), conditional analysis was performed. 

The regression estimates were combined across studies using
inverse-variance weighted, fixed-effect meta-analysis with
METAL, a tool for meta-analysis of genome-wide association
scans (19). The Cochran Q statistic was used to test for
heterogeneity by study and whether the meta-analyzed odds ratios
(ORs) were heterogeneous by race/ethnicity, histological cell type,
sex, and smoking status. To account for multiple testing of 165
SNPs and 11 stratified analyses (four race/ethnicities, three
histological cell types, two sexes, two levels of smoking status), we
used a Bonferroni-corrected P value to assign statistical significance
(α = .05/[165 SNPs x 12 tests, that is, the overall analysis plus the
11 stratified analyses] = 2.5 x 10^-5).
No additional associations were detected at a less stringent P value
(e.g., .05/165 SNPs = 3 x 10^-4). Statistical tests were two-sided.
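
The pooling step and the significance threshold can be sketched in a few lines. The code below is an illustration of inverse-variance weighted fixed-effect meta-analysis with the Cochran Q statistic, not a reimplementation of METAL; the function names and the per-study estimates in the example are hypothetical.

```python
import math


def fixed_effect_meta(log_ors, ses):
    """Inverse-variance weighted fixed-effect pooling of per-study log odds ratios.

    Returns (pooled log OR, pooled SE, Cochran Q, two-sided P value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # Cochran Q: weighted squared deviations of study estimates from the pooled
    # estimate; under homogeneity it is approximately chi-square with k - 1 df.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    z = pooled / pooled_se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, q, p_two_sided


def bonferroni_threshold(alpha, n_snps, n_tests_per_snp):
    """Per-test significance level after Bonferroni correction."""
    return alpha / (n_snps * n_tests_per_snp)


# Hypothetical per-study estimates for one SNP
log_ors = [math.log(1.08), math.log(1.12), math.log(1.10)]
ses = [0.04, 0.05, 0.03]
pooled, pooled_se, q, p_value = fixed_effect_meta(log_ors, ses)
threshold = bonferroni_threshold(0.05, 165, 12)
```

With α = .05, 165 SNPs, and 12 tests per SNP, bonferroni_threshold returns approximately 2.5 x 10^-5, the threshold stated above.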

Results 

The main characteristics of the 18 023 patients with lung cancer 
and 60 543 control subjects are presented in Supplementary Table 1 
(available online). The PAGE study comprised European-ancestry,
African American, Hispanic, Asian, Pacific Islander, and
American Indian populations. The TRICL study comprised
only individuals of European descent. The great majority (96%)
of subjects were of European ancestry. Also, the majority of patients
and control subjects were older than 50 years, with the exception of
the Helmholtz-Gemeinschaft Deutscher Forschungszentren Lung
Cancer GWAS (HGF) Germany study, where all subjects were
50 years of age or younger (3%). All studies, except WHI,
comprised both sexes. In all studies, patients were more likely to
be ever smokers and control subjects were more likely to be never
smokers. Histology information was available for all studies, with
the exception of ARIC. Among the studies with histology information,
adenocarcinoma (34.0%) was the most common cell type,
with the exception of the International Agency for Research on
Cancer (IARC) GWAS, where SCC was more common (35.6%).



We evaluated the association between 18 known lung risk 
variants located in previously identified lung cancer risk loci 
and risk of lung cancer among European-ancestry populations 
(Supplementary Table 2, available online). Of the 18 lung cancer 
risk variants, 16 replicated at P < .05. 

Among the 165 risk variants, 15 were nominally associated with
lung cancer at P < .05 (Figure 1; Supplementary Table 3, available
online), which is notably more than the eight associations expected
by chance (i.e., 165 SNPs x .05 = 8.3). Using a binomial distribution
with p = .05 and n = 165 SNPs, the probability of observing
15 or more associations is .009. These 15 associations included
eight prostate cancer variants, four glioma variants, one breast
cancer variant, one childhood acute lymphocytic leukemia variant,
and one follicular lymphoma variant. Twelve of the 15 SNPs were
associated with an increased risk of lung cancer in the same direction
as the known GWAS association. No heterogeneity by
race/ethnicity (P > .05) was noted for the 15 nominally associated SNPs
(Supplementary Table 4, available online).
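
The comparison with chance can be sketched as an exact binomial tail sum. The function name is an assumption, and because the text does not state which tail convention or software was used, the sketch computes both "at least 15" and "more than 15"; it is not guaranteed to reproduce the reported probability exactly.

```python
import math


def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p), summed exactly."""
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


expected_by_chance = 165 * 0.05                 # about 8.3 SNPs at P < .05
p_at_least_15 = binomial_tail(165, 0.05, 15)    # 15 or more nominal associations
p_more_than_15 = binomial_tail(165, 0.05, 16)   # strictly more than 15
```

Both tail probabilities fall well below .05, which is the sense in which 15 nominal associations exceed what chance alone would be expected to produce.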

Figure 1. Manhattan plot of the meta-analysis association between risk variants of 16 other cancers and lung cancer. The solid line is the
Bonferroni-corrected significance threshold. Each association is colored according to the cancer for which the single-nucleotide polymorphism was originally
reported, and positioned on the x-axis according to its genomic position.

Breast Cancer SNP LSP1 rs3817198, region 11p15.5

Figure 2. Forest plot of the association between lymphocyte-specific
protein 1 (LSP1) rs3817198 and lung cancer risk. Study-specific and
meta-analysis associations are plotted, modeling the C risk allele for breast
cancer. Squares represent odds ratios (ORs); size of the square
represents inverse of the variance of the log ORs; horizontal lines represent
95% confidence intervals (CIs); diamonds represent summary estimate
combining the study-specific estimates with a fixed-effects model; solid
vertical lines represent OR = 1; dashed vertical lines represent the overall
ORs. The single-nucleotide polymorphism (SNP) rs3817198 was
genotyped in all studies. GWAS = genome-wide association study.

Figure 3. Forest plot of the association between telomerase reverse
transcriptase gene (TERT) rs2853676 and lung adenocarcinoma risk.
Study-specific and meta-analysis associations are plotted, modeling the A risk
allele for glioma. Squares represent odds ratios (ORs); size of the square
represents inverse of the variance of the log ORs; horizontal lines represent
95% confidence intervals (CIs); diamonds represent summary estimate
combining the study-specific estimates with a fixed-effects model; solid
vertical lines represent OR = 1; dashed vertical lines represent the overall
ORs. The single-nucleotide polymorphism (SNP) rs2853676 was
genotyped in all studies. GWAS = genome-wide association study.

The breast cancer SNP LSP1 rs3817198 was associated with
an increased risk of lung cancer (OR = 1.10; 95% confidence
interval [CI] = 1.05 to 1.14) and remained statistically significant
(P = 2.8 x 10^-6) after correction for multiple comparisons (Figure 2).
This association was heterogeneous by cell type (Phet = .03) and
sex (Phet = .01), where it appeared to be limited to adenocarcinoma
(OR = 1.11; 95% CI = 1.05 to 1.17; P = 1.14 x 10^-4) (Supplementary
Table 5, available online) and women (OR = 1.16; 95% CI = 1.09
to 1.23; P = 4.31 x 10^-6) (Supplementary Table 6, available online).
This association was not observed in SCC or small cell carcinoma
(P > .35) or in men (P = .16). In stratified analysis by both sex and
histology cell type (data not shown), among studies with available
data, we found that the association was present for female
adenocarcinoma (n = 1607 cases, 4 studies: EAGLE-BioVU, MEC, National
Cancer Institute Lung GWAS [NCI], and WHI; OR = 1.19;
P = 1.2 x 10^-4). This association was not observed for male
adenocarcinoma (n = 1507, 3 studies: EAGLE-BioVU, MEC, NCI; P = .14).
However, the test for heterogeneity in effects between rs3817198
and adenocarcinoma by sex was not statistically significant (P = .10).
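
The link between a reported odds ratio, its confidence interval, and the Wald P value can be sketched as below. Because the published ORs and CIs are rounded to two decimals, this back-calculation recovers only the approximate order of magnitude of a P value and is not expected to reproduce the reported figures exactly; the function name is an assumption.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald P value from an OR and its 95% CI."""
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


p_overall = p_from_or_ci(1.10, 1.05, 1.14)  # rs3817198, overall lung cancer
p_women = p_from_or_ci(1.16, 1.09, 1.23)    # rs3817198, women
```

A check of this kind is useful mainly for spotting transcription errors, for instance an exponent that is inconsistent with the width of the interval.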



Whereas the TERT rs2853676 variant was only nominally
associated with overall lung cancer (P = .001) (Supplementary Table 3,
available online), a statistically significant association with
adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22; P = 1.1 x 10^-8)
was observed among 5164 patients and 38 567 control subjects
(Figure 3; Supplementary Table 5 and Supplementary Figure 1A,
available online). This SNP was not associated with either SCC
or small cell carcinoma (P > .18) (Phet by cell type = 3.9x 10~*). In
a subset of six studies with available data (IARC, MD Anderson
Cancer Center [MDACC], MEC, NCI, Samuel Lunenfeld
Research Institute study [SLRI], and WHI), when conditioning on
the known TERT risk variant for lung cancer (rs2736100; linkage
disequilibrium [LD] with rs2853676 in European CEU: r^2 = 0.17),
the association with adenocarcinoma was attenuated (OR = 1.06;
P = .09). Alternatively, the meta-analyzed result among these six
studies when not conditioned on rs2736100 was similar to the main
adenocarcinoma finding (OR = 1.16; P = 1.3 x 10"').
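
For reference, the LD measure quoted above can be sketched from haplotype and allele frequencies. The function name and the frequencies in the usage note are hypothetical; the r^2 of 0.17 between rs2853676 and rs2736100 comes from the CEU reference data, not from this calculation.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
```

For example, ld_r2(0.25, 0.5, 0.5) is 0 because the haplotype frequency equals the product of the allele frequencies, whereas ld_r2(0.5, 0.5, 0.5) is 1 because the two alleles always travel together. A low r^2 does not by itself establish independent effects, which is why the conditional analysis was also performed.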

The CDKN2BAS1 glioma SNP, rs4977756, was not associated
with overall lung cancer risk (P = .13) but was associated with
SCC (OR = 1.13; 95% CI = 1.07 to 1.19; P = 2.5 x 10^-5) (Figure 4;
Supplementary Figure 1B, available online). This SNP was not
associated with adenocarcinoma (P = .68) or small cell carcinoma
(P = .48) (Phet by cell type = .0006) (Supplementary Table 5,
available online). Independent effects between rs4977756 and the
previously reported lung cancer risk variant in 9p21.3 (11) could not be
determined because only a small subset of data on the latter variant
was available.

Among the 15 nominally statistically significant associations,
only two associations were heterogeneous by smoking status (8q24
rs10090154 and 6p21.33 rs6457327) at P < .05 (Supplementary
Table 7, available online).



Discussion 

In this large meta-analysis of 18023 lung cancer patients and
60 543 control subjects, we examined 165 established cancer risk
variants (excluding lung cancer and smoking-related risk variants)
and their associations with lung cancer. This is the first study to
systematically examine pleiotropic effects from risk variants
identified in GWASs of other malignancies on the risk of lung cancer.
We found that the breast cancer risk allele "C" of LSP1 rs3817198
was associated with an increased risk of lung cancer.

LSP1 encodes the lymphocyte-specific protein 1, an F-actin
bundling cytoskeletal protein. In GWAS, common variants in or
near the gene have been associated with risk of breast cancer in
women (20) and ulcerative colitis in men and women (21). This
LSP1 region is conserved in mice, and studies have found loss of
heterozygosity in this region in breast and lung cancers (22,23).
We found that this association was stronger in women for overall
lung cancer and for adenocarcinoma. When stratifying on both
histology and sex, we observed an association in women with
adenocarcinoma but not in men with adenocarcinoma. Furthermore,
epidemiologic studies of familial aggregation of cancers found an
excess of breast cancer among relatives of nonsmokers with lung
cancer (24) and relatives of patients with early-onset lung cancer (25), suggesting
a genetic susceptibility across these two cancers. To confirm that
this association was not a result of excess breast cancer cases, we
excluded lung cancer cases with previous history of breast cancer
and obtained similar results. The underlying biological mechanism
through which LSP1 may influence cancer development remains
to be elucidated. LSP1 is expressed in lymphocytes, neutrophils,
macrophages, and endothelial cells and may regulate neutrophil
motility, adhesion to fibrinogen matrix proteins, and
transendothelial migration (26).

Risk variants in or near the TERT-CLPTM1L locus have been
associated with risk of several cancer sites (6), including
adenocarcinoma of the lung (6,7,9,27). TERT encodes telomerase
reverse transcriptase, which maintains telomere length through
each cell division. Telomere shortening is associated with increased
genomic instability, thereby increasing the risk of cancer
development. The "A" allele of rs2853676, located in intron 2 of TERT,
was initially reported to be associated with an increased risk of
glioma (28). In our study, we found a strongly statistically significant
association with adenocarcinoma and notable heterogeneity by
histological cell type. Consistent with our findings, the NCI study,
which is part of TRICL, reported a modest association between
rs2853676 and adenocarcinoma (P = 3.4x10"*) (7). This same
study identified TERT rs2736100, also located in intron 2, to be
associated with a 12% increase in lung cancer risk (P = 1.6 x 10^-10)
(7). Whereas rs2853676 is in low LD with rs2736100 (European
CEU: r^2 = .17), results from our conditional analysis suggest that
the association between rs2853676 and adenocarcinoma may not
be independent of rs2736100. In addition, a recent Japanese study
found that TERT rs2853677 (CEU: r^2 = 0.59) is associated with
lung adenocarcinoma (P = 3.1 x 10^) (29). However, because this
SNP was not genotyped in our study, we were unable to condition
on rs2853677. It is possible that the association between
rs2853676 and adenocarcinoma may be influenced by rs2736100
and rs2853677.

Figure 4. Forest plot of the association between cyclin-dependent
kinase 4 inhibitor B antisense RNA 1 (CDKN2BAS1) rs4977756 and
lung squamous cell carcinoma risk. Study-specific and meta-analysis
associations are plotted, modeling the G risk allele for glioma.
Squares represent odds ratios (ORs); size of the square represents
inverse of the variance of the log ORs; horizontal lines represent
95% confidence intervals (CIs); diamonds represent summary
estimate combining the study-specific estimates with a fixed-effects
model; solid vertical lines represent OR = 1; dashed vertical lines
represent the overall ORs. The single-nucleotide polymorphism (SNP)
rs4977756 was genotyped in all studies. GWAS = genome-wide
association study.

We found that rs4977756 at 9p21.3 was associated with SCC.
This SNP is located in CDKN2BAS1, a long noncoding RNA
region, and near the cluster of two tumor suppressor genes,
CDKN2A and CDKN2B. CDKN2BAS1 has been implicated in
the development of multiple chronic diseases and cancers, due
to the role of CDKN2A and CDKN2B in cell cycle inhibition,
senescence, and stress-induced apoptosis (30). Furthermore, three
CDKN2BAS1 spliced variant transcripts expressed in lung cancer
cell lines (31) have been shown to have various enhancer
activities (32). The SNP rs4977756 has been previously associated with
glioma (28,33) and glaucoma (34). A recent meta-analysis of lung
cancer GWASs by TRICL found rs1333040, which is approximately
74 kb upstream from CDKN2B, to be associated with lung
cancer (OR = 1.06; P = 9 Ax 10"'), with a stronger association for
SCC (OR = 1.14; P = 2.9x10 ') (11). Among European-ancestry
populations, there is little LD between rs1333040 and rs4977756
(CEU + Toscans in Italy [TSI]: r^2 = 0.27). However, because only
two studies had genotype data for rs1333040, we were unable to
examine the independent effects of the two SNPs. Further
evaluation of rs4977756 and SCC risk is needed.

Our finding of pleiotropy between the breast cancer risk locus
at LSP1 and lung cancer risk points toward shared etiologic
mechanisms for these two cancer sites. Concurrently, we observed cell
type-specific effects for lung cancer with two variants located in
cancer pleiotropic regions (TERT and risk of lung adenocarcinoma
and CDKN2BAS1 with risk of lung SCC), indicating distinct
etiological processes for these two subtypes. These observations of
shared and distinct effects with particular genetic loci are
consistent with other studies of lung cancer. For example, EGFR kinase
domain mutations are frequent in lung adenocarcinoma of
nonsmokers and extremely rare in lung SCC (35). Alternatively, the
EGFR variant III mutations have been found in lung SCC and
gliomas (36), but not in lung adenocarcinoma (35). These
findings demonstrate the complexity of carcinogenesis and the need to
study both shared and distinct etiological pathways.



Study limitations include reduced power to detect effects for some
of the 165 SNPs. Nonetheless, 72% of the SNPs were genotyped
in more than 50% of studies. Due to the limited size of the
populations of non-European descent, we were unable to fully examine
the generalizability of effects across these populations. Additionally,
with the available data, we could only test in a subset of studies the
independence of the TERT rs2853676 association from the
previously reported TERT associations. Thus, the associations that we
observed with TERT rs2853676 and CDKN2BAS1 rs4977756 may
reflect weak LD with previously identified lung cancer risk
variants in these regions. However, because the functional SNPs for
these regions remain unknown, our findings are informative for
future studies (e.g., fine-mapping, functional, and
population-specific generalizability studies). Furthermore, we recognize the need
to study the additional risk loci identified by more recent cancer
GWASs. Last, as the majority of our controls excluded all cancers,
there may have been a greater likelihood of observing associations
with the cancer risk variants studied. However, in the MEC, where
control selection allowed inclusion of subjects with cancers other
than lung cancer, the associations for the top SNPs were consistent
with the overall findings. Our study strengths include the systematic
"candidate-SNP" approach based on strong prior evidence of an
association from GWASs of cancer, the large sample size from
well-characterized epidemiologic lung cancer studies, and the power to
examine these associations by cell type, smoking status, and sex.

In conclusion, the breast cancer SNP LSP1 rs3817198 was
associated with lung cancer risk. Our results support the influence of
non-lung cancer risk variants on the risk of lung cancer, and these
associations may differ by histological cell type and sex. Molecular
studies are needed to better characterize these genetic effects and
to increase our understanding of the apparent heterogeneity of
effects across sex and histological cell type.